Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 462 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 39.8 KiB |
| Average record size in memory | 88.3 B |
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 2 |
Alerts
| Alert | Type |
|---|---|
| adiposity is highly correlated with obesity and 1 other field | High correlation |
| obesity is highly correlated with adiposity | High correlation |
| age is highly correlated with tobacco and 1 other field | High correlation |
| tobacco is highly correlated with age | High correlation |
| names is uniformly distributed | Uniform |
| names has unique values | Unique |
| tobacco has 107 (23.2%) zeros | Zeros |
| alcohol has 110 (23.8%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-01 20:24:22.426650 |
|---|---|
| Analysis finished | 2022-11-01 20:24:40.103651 |
| Duration | 17.68 seconds |
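The Duration row can be cross-checked against the two timestamps in the same table. The snippet below is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps were printed.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps
started = datetime.strptime("2022-11-01 20:24:22.426650", FMT)
finished = datetime.strptime("2022-11-01 20:24:40.103651", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```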
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
names
Real number (ℝ≥0)
| Distinct | 462 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 231.9350649 |
| Minimum | 1 |
| Maximum | 463 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 24.05 |
| Q1 | 116.25 |
| median | 231.5 |
| Q3 | 347.75 |
| 95-th percentile | 439.95 |
| Maximum | 463 |
| Range | 462 |
| Interquartile range (IQR) | 231.5 |
Descriptive statistics
| Standard deviation | 133.9385851 |
|---|---|
| Coefficient of variation (CV) | 0.5774831207 |
| Kurtosis | -1.203538107 |
| Mean | 231.9350649 |
| Median Absolute Deviation (MAD) | 116 |
| Skewness | 0.001436279421 |
| Sum | 107154 |
| Variance | 17939.54458 |
| Monotonicity | Strictly increasing |
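The statistics for names are consistent with an integer identifier running from 1 to 463 with exactly one value skipped. The sketch below infers the skipped value from the reported Sum and then recomputes a few of the figures. The reconstruction is an assumption (the report never lists the full column), and `statistics.quantiles` with `method="inclusive"` is used here as a stand-in for the linear interpolation the profiler appears to apply.

```python
import statistics

# Assumption: names holds the integers 1..463 with a single gap.
# The gap is inferred from the reported Sum; the report does not state it.
full_sum = sum(range(1, 464))   # sum of 1..463
missing = full_sum - 107154     # 107154 is the reported Sum
names = [v for v in range(1, 464) if v != missing]

# Quartiles with linear interpolation between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(names, n=4, method="inclusive")

print("skipped value:", missing, "observations:", len(names))
print("mean:", statistics.fmean(names))
print("Q1, median, Q3:", q1, median, q3)
print("variance:", statistics.variance(names))  # sample variance (n - 1)
```

If the reconstruction holds, the printed values should agree with the Mean, Q1, median, Q3 and Variance rows above up to rounding.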
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 0.2% |
| 319 | 1 | 0.2% |
| 317 | 1 | 0.2% |
| 316 | 1 | 0.2% |
| 315 | 1 | 0.2% |
| 314 | 1 | 0.2% |
| 313 | 1 | 0.2% |
| 312 | 1 | 0.2% |
| 311 | 1 | 0.2% |
| 310 | 1 | 0.2% |
| Other values (452) | 452 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 463 | 1 | |
| 462 | 1 | |
| 461 | 1 | |
| 460 | 1 | |
| 459 | 1 | |
| 458 | 1 | |
| 457 | 1 | |
| 456 | 1 | |
| 455 | 1 | |
| 454 | 1 |
sbp
Real number (ℝ≥0)
| Distinct | 62 |
|---|---|
| Distinct (%) | 13.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.532204944 × 10⁻⁵ |
| Minimum | 2.104199983 × 10⁻⁵ |
| Maximum | 9.802960494 × 10⁻⁵ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 2.104199983 × 10⁻⁵ |
|---|---|
| 5-th percentile | 3.228305785 × 10⁻⁵ |
| Q1 | 4.565376187 × 10⁻⁵ |
| median | 5.56916908 × 10⁻⁵ |
| Q3 | 6.50364204 × 10⁻⁵ |
| 95-th percentile | 7.971938776 × 10⁻⁵ |
| Maximum | 9.802960494 × 10⁻⁵ |
| Range | 7.698760511 × 10⁻⁵ |
| Interquartile range (IQR) | 1.938265853 × 10⁻⁵ |
Descriptive statistics
| Standard deviation | 1.430315091 × 10⁻⁵ |
|---|---|
| Coefficient of variation (CV) | 0.2585434027 |
| Kurtosis | -0.1226353987 |
| Mean | 5.532204944 × 10⁻⁵ |
| Median Absolute Deviation (MAD) | 9.344729596 × 10⁻⁶ |
| Skewness | 0.063303759 |
| Sum | 0.02555878684 |
| Variance | 2.045801259 × 10⁻¹⁰ |
| Monotonicity | Not monotonic |
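Several rows in the sbp tables are derived from others: Range is maximum minus minimum, IQR is Q3 minus Q1, CV is the standard deviation divided by the mean, and Variance is the squared standard deviation. The following sketch recomputes them from the reported figures; the variable names are illustrative and the inputs are the rounded values printed above, so agreement is only expected to about nine significant digits.

```python
# Values copied from the sbp tables above.
minimum, maximum = 2.104199983e-5, 9.802960494e-5
q1, q3 = 4.565376187e-5, 6.50364204e-5
mean, std = 5.532204944e-5, 1.430315091e-5

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"Range    {value_range:.9e}")
print(f"IQR      {iqr:.9e}")
print(f"CV       {cv:.10f}")
print(f"Variance {variance:.9e}")
```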
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.406574394 × 10⁻⁵ | 29 | 6.3% |
| 5.56916908 × 10⁻⁵ | 29 | 6.3% |
| 6.103515625 × 10⁻⁵ | 25 | 5.4% |
| 5.739210285 × 10⁻⁵ | 24 | 5.2% |
| 7.181844298 × 10⁻⁵ | 21 | 4.5% |
| 6.50364204 × 10⁻⁵ | 21 | 4.5% |
| 5.917159763 × 10⁻⁵ | 20 | 4.3% |
| 6.298815823 × 10⁻⁵ | 20 | 4.3% |
| 5.25099769 × 10⁻⁵ | 18 | 3.9% |
| 6.718624026 × 10⁻⁵ | 17 | 3.7% |
| Other values (52) | 238 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.104199983 × 10⁻⁵ | 1 | 0.2% |
| 2.143347051 × 10⁻⁵ | 1 | 0.2% |
| 2.183596821 × 10⁻⁵ | 1 | 0.2% |
| 2.311390533 × 10⁻⁵ | 3 | |
| 2.356489773 × 10⁻⁵ | 2 | |
| 2.5 × 10⁻⁵ | 1 | 0.2% |
| 2.550760127 × 10⁻⁵ | 1 | 0.2% |
| 2.657030503 × 10⁻⁵ | 2 | |
| 2.770083102 × 10⁻⁵ | 2 | |
| 2.829334541 × 10⁻⁵ | 1 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9.802960494 × 10⁻⁵ | 1 | 0.2% |
| 9.611687812 × 10⁻⁵ | 1 | 0.2% |
| 9.425959091 × 10⁻⁵ | 1 | 0.2% |
| 8.8999644 × 10⁻⁵ | 3 | 0.6% |
| 8.573388203 × 10⁻⁵ | 7 | |
| 8.416799933 × 10⁻⁵ | 1 | 0.2% |
| 8.26446281 × 10⁻⁵ | 4 | 0.9% |
| 7.971938776 × 10⁻⁵ | 7 | |
| 7.694675285 × 10⁻⁵ | 12 | |
| 7.431629013 × 10⁻⁵ | 8 |
tobacco
Real number (ℝ≥0)
| Distinct | 214 |
|---|---|
| Distinct (%) | 46.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.260969685 |
| Minimum | 0 |
| Maximum | 3.959695934 |
| Zeros | 107 |
| Zeros (%) | 23.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.3074151682 |
| median | 1.319507911 |
| Q3 | 1.977630156 |
| 95-th percentile | 2.745518255 |
| Maximum | 3.959695934 |
| Range | 3.959695934 |
| Interquartile range (IQR) | 1.670214988 |
Descriptive statistics
| Standard deviation | 0.9426485881 |
|---|---|
| Coefficient of variation (CV) | 0.7475584857 |
| Kurtosis | -0.8942315863 |
| Mean | 1.260969685 |
| Median Absolute Deviation (MAD) | 0.7281646003 |
| Skewness | 0.1347283054 |
| Sum | 582.5679944 |
| Variance | 0.8885863606 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 107 | 23.2% |
| 2.047672511 | 11 | 2.4% |
| 1.551845574 | 10 | 2.2% |
| 0.6931448432 | 8 | 1.7% |
| 1.741101127 | 8 | 1.7% |
| 1.825093026 | 7 | 1.5% |
| 1.775414311 | 7 | 1.5% |
| 2.701920077 | 5 | 1.1% |
| 0.8151931096 | 5 | 1.1% |
| 1.319507911 | 5 | 1.1% |
| Other values (204) | 289 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 107 | |
| 0.1584893192 | 1 | 0.2% |
| 0.2091279105 | 1 | 0.2% |
| 0.2459509486 | 1 | 0.2% |
| 0.2759459323 | 2 | 0.4% |
| 0.3017088168 | 4 | 0.9% |
| 0.3245342223 | 1 | 0.2% |
| 0.3451749066 | 1 | 0.2% |
| 0.3641128406 | 2 | 0.4% |
| 0.381677891 | 1 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.959695934 | 1 | |
| 3.759241489 | 1 | |
| 3.624478073 | 1 | |
| 3.314454017 | 2 | |
| 3.287777572 | 1 | |
| 3.277689744 | 1 | |
| 3.260772438 | 1 | |
| 3.191747708 | 1 | |
| 3.177671523 | 1 | |
| 3.031433133 | 1 |
ldl
Real number (ℝ≥0)
| Distinct | 329 |
|---|---|
| Distinct (%) | 71.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.159086711 |
| Minimum | 0.9979817686 |
| Maximum | 1.313875503 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 0.9979817686 |
|---|---|
| 5-th percentile | 1.081762775 |
| Q1 | 1.126212749 |
| median | 1.158107763 |
| Q3 | 1.191976466 |
| 95-th percentile | 1.237224221 |
| Maximum | 1.313875503 |
| Range | 0.3158937347 |
| Interquartile range (IQR) | 0.06576371711 |
Descriptive statistics
| Standard deviation | 0.04911563628 |
|---|---|
| Coefficient of variation (CV) | 0.04237442792 |
| Kurtosis | 0.1886351067 |
| Mean | 1.159086711 |
| Median Absolute Deviation (MAD) | 0.03285621182 |
| Skewness | 0.02874546156 |
| Sum | 535.4980604 |
| Variance | 0.002412345727 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.158905819 | 5 | 1.1% |
| 1.147254341 | 5 | 1.1% |
| 1.135708357 | 5 | 1.1% |
| 1.136026083 | 4 | 0.9% |
| 1.091493426 | 4 | 0.9% |
| 1.153212478 | 4 | 0.9% |
| 1.12681182 | 4 | 0.9% |
| 1.188641364 | 3 | 0.6% |
| 1.123349763 | 3 | 0.6% |
| 1.122292153 | 3 | 0.6% |
| Other values (319) | 422 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9979817686 | 1 | |
| 1.006788805 | 1 | |
| 1.036414794 | 1 | |
| 1.044800014 | 1 | |
| 1.047465463 | 1 | |
| 1.055114548 | 1 | |
| 1.055729956 | 1 | |
| 1.056951172 | 1 | |
| 1.058759516 | 1 | |
| 1.060540482 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.313875503 | 1 | |
| 1.303485863 | 1 | |
| 1.286507018 | 1 | |
| 1.28090873 | 1 | |
| 1.277859841 | 1 | |
| 1.275641279 | 1 | |
| 1.274631487 | 1 | |
| 1.272932331 | 1 | |
| 1.266043322 | 1 | |
| 1.265443727 | 1 |
adiposity
Real number (ℝ≥0)
| Distinct | 408 |
|---|---|
| Distinct (%) | 88.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.4067316 |
| Minimum | 6.74 |
| Maximum | 42.49 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 6.74 |
|---|---|
| 5-th percentile | 12.0065 |
| Q1 | 19.775 |
| median | 26.115 |
| Q3 | 31.2275 |
| 95-th percentile | 37.1165 |
| Maximum | 42.49 |
| Range | 35.75 |
| Interquartile range (IQR) | 11.4525 |
Descriptive statistics
| Standard deviation | 7.780698596 |
|---|---|
| Coefficient of variation (CV) | 0.306245554 |
| Kurtosis | -0.6984386244 |
| Mean | 25.4067316 |
| Median Absolute Deviation (MAD) | 5.7 |
| Skewness | -0.2146459286 |
| Sum | 11737.91 |
| Variance | 60.53927064 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 30.79 | 3 | 0.6% |
| 27.55 | 3 | 0.6% |
| 21.1 | 3 | 0.6% |
| 29.3 | 3 | 0.6% |
| 24.65 | 2 | 0.4% |
| 29.18 | 2 | 0.4% |
| 30.9 | 2 | 0.4% |
| 30.11 | 2 | 0.4% |
| 26.08 | 2 | 0.4% |
| 32.03 | 2 | 0.4% |
| Other values (398) | 438 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6.74 | 1 | |
| 7.12 | 1 | |
| 8.66 | 1 | |
| 9.28 | 1 | |
| 9.37 | 1 | |
| 9.39 | 1 | |
| 9.64 | 1 | |
| 9.69 | 2 | |
| 9.74 | 1 | |
| 10.05 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 42.49 | 1 | |
| 42.17 | 1 | |
| 42.06 | 1 | |
| 41.05 | 1 | |
| 40.6 | 1 | |
| 39.97 | 1 | |
| 39.71 | 1 | |
| 39.68 | 1 | |
| 39.66 | 1 | |
| 39.64 | 1 |
famhist
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 6.415584416 |
| Min length | 6 |
Characters and Unicode
| Total characters | 2964 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
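The length and character figures for famhist follow directly from the two category counts (270 Absent, 192 Present) reported in its Common Values table. The sketch below recomputes them with a weighted character tally; it is an illustration of the arithmetic, not the profiler's own code.

```python
from collections import Counter

# Category counts copied from the famhist "Common Values" table.
counts = {"Absent": 270, "Present": 192}

n = sum(counts.values())
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n

# Tally every character, weighted by how often its category occurs.
chars = Counter()
for value, count in counts.items():
    for ch in value:
        chars[ch] += count

print("total characters:", total_chars)
print("distinct characters:", len(chars))
print("mean length:", mean_length)
for ch, c in chars.most_common():
    print(ch, c, f"{100 * c / total_chars:.1f}%")
```

The last loop also yields the percentages that are blank in the character tables further down.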
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Present |
|---|---|
| 2nd row | Absent |
| 3rd row | Present |
| 4th row | Present |
| 5th row | Present |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Absent | 270 | |
| Present | 192 |
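The Frequency (%) cells that did not survive extraction can be recomputed, since each is the row count divided by the number of observations. A minimal sketch, assuming one-decimal formatting as used elsewhere in the report:

```python
# Counts copied from the famhist "Common Values" table.
counts = {"Absent": 270, "Present": 192}
total = sum(counts.values())  # number of observations

# Frequency (%) = 100 * count / total.
frequencies = {value: 100 * count / total for value, count in counts.items()}

for value, count in counts.items():
    print(f"| {value} | {count} | {frequencies[value]:.1f}% |")
```

The same calculation applies to the blank cells in the numeric tables, using 462 as the denominator.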
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
|---|---|---|
| absent | 270 | |
| present | 192 |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 654 | |
| s | 462 | |
| n | 462 | |
| t | 462 | |
| A | 270 | |
| b | 270 | |
| P | 192 | 6.5% |
| r | 192 | 6.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 2502 | |
| Uppercase Letter | 462 | 15.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| e | 654 | |
| s | 462 | |
| n | 462 | |
| t | 462 | |
| b | 270 | |
| r | 192 | 7.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| A | 270 | |
| P | 192 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 2964 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| e | 654 | |
| s | 462 | |
| n | 462 | |
| t | 462 | |
| A | 270 | |
| b | 270 | |
| P | 192 | 6.5% |
| r | 192 | 6.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 2964 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| e | 654 | |
| s | 462 | |
| n | 462 | |
| t | 462 | |
| A | 270 | |
| b | 270 | |
| P | 192 | 6.5% |
| r | 192 | 6.5% |
typea
Real number (ℝ≥0)
| Distinct | 54 |
|---|---|
| Distinct (%) | 11.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.1038961 |
| Minimum | 13 |
| Maximum | 78 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 36 |
| Q1 | 47 |
| median | 53 |
| Q3 | 60 |
| 95-th percentile | 69 |
| Maximum | 78 |
| Range | 65 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 9.817534116 |
|---|---|
| Coefficient of variation (CV) | 0.1848740834 |
| Kurtosis | 0.4704023399 |
| Mean | 53.1038961 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.3464377547 |
| Sum | 24534 |
| Variance | 96.38397611 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 52 | 25 | 5.4% |
| 57 | 23 | 5.0% |
| 54 | 21 | 4.5% |
| 50 | 21 | 4.5% |
| 49 | 20 | 4.3% |
| 60 | 18 | 3.9% |
| 56 | 18 | 3.9% |
| 55 | 17 | 3.7% |
| 61 | 17 | 3.7% |
| 47 | 17 | 3.7% |
| Other values (44) | 265 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 1 | 0.2% |
| 20 | 1 | 0.2% |
| 25 | 1 | 0.2% |
| 26 | 1 | 0.2% |
| 28 | 1 | 0.2% |
| 29 | 1 | 0.2% |
| 30 | 2 | |
| 31 | 2 | |
| 32 | 1 | 0.2% |
| 33 | 4 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 78 | 1 | 0.2% |
| 77 | 1 | 0.2% |
| 75 | 1 | 0.2% |
| 74 | 2 | 0.4% |
| 73 | 2 | 0.4% |
| 72 | 4 | |
| 71 | 2 | 0.4% |
| 70 | 5 | |
| 69 | 7 | |
| 68 | 6 |
obesity
Real number (ℝ≥0)
| Distinct | 400 |
|---|---|
| Distinct (%) | 86.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2733598884 |
| Minimum | 0.2151395254 |
| Maximum | 0.3412503191 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 0.2151395254 |
|---|---|
| 5-th percentile | 0.2465298384 |
| Q1 | 0.2618649047 |
| median | 0.2724698541 |
| Q3 | 0.285379676 |
| 95-th percentile | 0.3006890692 |
| Maximum | 0.3412503191 |
| Range | 0.1261107936 |
| Interquartile range (IQR) | 0.02351477135 |
Descriptive statistics
| Standard deviation | 0.01707294739 |
|---|---|
| Coefficient of variation (CV) | 0.06245593487 |
| Kurtosis | 0.5194706974 |
| Mean | 0.2733598884 |
| Median Absolute Deviation (MAD) | 0.01136377502 |
| Skewness | -0.0002731432448 |
| Sum | 126.2922684 |
| Variance | 0.0002914855325 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.2765664851 | 4 | 0.9% |
| 0.2712753724 | 4 | 0.9% |
| 0.2877728395 | 3 | 0.6% |
| 0.290740378 | 3 | 0.6% |
| 0.2760342845 | 3 | 0.6% |
| 0.2903701594 | 3 | 0.6% |
| 0.2664394851 | 3 | 0.6% |
| 0.2622241365 | 3 | 0.6% |
| 0.2716923988 | 3 | 0.6% |
| 0.2873647592 | 3 | 0.6% |
| Other values (390) | 430 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.2151395254 | 1 | |
| 0.216749204 | 1 | |
| 0.2247479836 | 1 | |
| 0.2278797018 | 1 | |
| 0.2314553958 | 1 | |
| 0.2341086104 | 1 | |
| 0.2348577613 | 1 | |
| 0.2352860241 | 1 | |
| 0.237286666 | 1 | |
| 0.2383360355 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3412503191 | 1 | |
| 0.3164613359 | 1 | |
| 0.3160344542 | 1 | |
| 0.3154684005 | 1 | |
| 0.3122129618 | 1 | |
| 0.3115353414 | 1 | |
| 0.3112657305 | 1 | |
| 0.3095989586 | 1 | |
| 0.3069958387 | 1 | |
| 0.3060392129 | 1 |
alcohol
Real number (ℝ≥0)
| Distinct | 249 |
|---|---|
| Distinct (%) | 53.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.263569291 |
| Minimum | 0 |
| Maximum | 7.364636603 |
| Zeros | 110 |
| Zeros (%) | 23.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.7638851554 |
| median | 2.239993375 |
| Q3 | 3.558795126 |
| 95-th percentile | 5.370803223 |
| Maximum | 7.364636603 |
| Range | 7.364636603 |
| Interquartile range (IQR) | 2.794909971 |
Descriptive statistics
| Standard deviation | 1.799918389 |
|---|---|
| Coefficient of variation (CV) | 0.795168231 |
| Kurtosis | -0.6164388224 |
| Mean | 2.263569291 |
| Median Absolute Deviation (MAD) | 1.379176308 |
| Skewness | 0.4009449211 |
| Sum | 1045.769012 |
| Variance | 3.239706205 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 110 | 23.8% |
| 1.335201735 | 16 | 3.5% |
| 0.7638851554 | 8 | 1.7% |
| 2.906330482 | 5 | 1.1% |
| 4.510176095 | 5 | 1.1% |
| 2.619905482 | 5 | 1.1% |
| 2.323592329 | 5 | 1.1% |
| 2.334844707 | 5 | 1.1% |
| 1.760097511 | 4 | 0.9% |
| 1.707536478 | 4 | 0.9% |
| Other values (239) | 295 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 110 | |
| 0.5146375139 | 1 | 0.2% |
| 0.5834307824 | 1 | 0.2% |
| 0.6718629455 | 2 | 0.4% |
| 0.7638851554 | 8 | 1.7% |
| 0.8151931096 | 1 | 0.2% |
| 0.8570448806 | 2 | 0.4% |
| 0.8620642522 | 1 | 0.2% |
| 0.8865284716 | 2 | 0.4% |
| 0.9414545972 | 1 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7.364636603 | 1 | |
| 7.326461799 | 1 | |
| 7.300372103 | 1 | |
| 6.787595021 | 1 | |
| 6.549994511 | 1 | |
| 6.506830627 | 1 | |
| 6.317641959 | 1 | |
| 6.238303588 | 1 | |
| 6.119020535 | 1 | |
| 6.074113131 | 1 |
age
Real number (ℝ≥0)
| Distinct | 49 |
|---|---|
| Distinct (%) | 10.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 42.81601732 |
| Minimum | 15 |
| Maximum | 64 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 17 |
| Q1 | 31 |
| median | 45 |
| Q3 | 55 |
| 95-th percentile | 62 |
| Maximum | 64 |
| Range | 49 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 14.60895644 |
|---|---|
| Coefficient of variation (CV) | 0.3412030675 |
| Kurtosis | -1.01622901 |
| Mean | 42.81601732 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.3817342585 |
| Sum | 19781 |
| Variance | 213.4216084 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=49)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 16 | 20 | 4.3% |
| 58 | 17 | 3.7% |
| 17 | 17 | 3.7% |
| 61 | 16 | 3.5% |
| 59 | 16 | 3.5% |
| 55 | 16 | 3.5% |
| 60 | 15 | 3.2% |
| 45 | 14 | 3.0% |
| 53 | 14 | 3.0% |
| 49 | 14 | 3.0% |
| Other values (39) | 303 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 15 | 3 | 0.6% |
| 16 | 20 | |
| 17 | 17 | |
| 18 | 8 | 1.7% |
| 19 | 2 | 0.4% |
| 20 | 6 | 1.3% |
| 21 | 3 | 0.6% |
| 23 | 2 | 0.4% |
| 24 | 6 | 1.3% |
| 25 | 4 | 0.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 64 | 13 | |
| 63 | 8 | |
| 62 | 12 | |
| 61 | 16 | |
| 60 | 15 | |
| 59 | 16 | |
| 58 | 17 | |
| 57 | 8 | |
| 56 | 9 | |
| 55 | 16 |
chd
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 462 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 462 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 462 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 462 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 302 | |
| 1 | 160 |
Auto
The auto setting is an easily interpretable pairwise column metric that picks a method by variable type (vartype-vartype: method): categorical-categorical: Cramér's V; numerical-categorical: Cramér's V (using a discretized numerical column); numerical-numerical: Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
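The definition above (Pearson's formula applied to rank variables) can be sketched in a few lines of standard-library Python. The helper names `average_ranks`, `pearson` and `spearman` are illustrative, tied values are assumed to receive the mean of their positions, and the toy data is hypothetical rather than taken from this dataset.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data: y = x**3 is nonlinear but strictly increasing.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

On this toy input the rank-based coefficient is essentially 1 while Pearson's r is lower, which is the behaviour the paragraph describes for nonlinear monotonic relationships.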
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
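As a small illustration of the formula and of the invariance property, the sketch below computes r on hypothetical toy data and again after shifting and rescaling each variable separately. The function name `pearson_r` and the data are assumptions for the example, and the scale factors are taken to be positive (a negative factor would flip the sign of r).

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    # The common 1/(n - 1) factors cancel between numerator and denominator.
    return cov / (ss_x * ss_y) ** 0.5

# Hypothetical toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Change location and scale of each variable separately (positive factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```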
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
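The pair-counting formula in the paragraph above corresponds to the tau-a variant, which can be sketched directly. The function name `kendall_tau_a` and the toy data are illustrative; tied pairs are counted as neither concordant nor discordant here, and libraries often report a tie-adjusted variant instead, so results can differ when ties are present.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
print(kendall_tau_a(x, y))
```

The double loop is quadratic in the number of observations, which is fine for a dataset of this size but not how optimized implementations work.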
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | names | sbp | tobacco | ldl | adiposity | famhist | typea | obesity | alcohol | age | chd |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.000039 | 2.701920 | 1.190736 | 23.11 | Present | 49 | 0.274632 | 6.238304 | 52 | 1 |
| 1 | 2 | 0.000048 | 0.158489 | 1.159962 | 28.61 | Absent | 55 | 0.260508 | 1.335202 | 63 | 1 |
| 2 | 3 | 0.000072 | 0.364113 | 1.132812 | 32.28 | Present | 52 | 0.259540 | 1.707536 | 46 | 0 |
| 3 | 4 | 0.000035 | 2.238847 | 1.204164 | 38.03 | Present | 51 | 0.250031 | 3.580604 | 58 | 1 |
| 4 | 5 | 0.000056 | 2.840636 | 1.133462 | 27.78 | Present | 60 | 0.271692 | 5.051066 | 49 | 1 |
| 5 | 6 | 0.000057 | 2.074707 | 1.205287 | 36.21 | Present | 62 | 0.253950 | 2.885226 | 45 | 0 |
| 6 | 7 | 0.000050 | 1.749774 | 1.129514 | 16.20 | Absent | 59 | 0.296955 | 1.470011 | 38 | 0 |
| 7 | 8 | 0.000077 | 1.754947 | 1.164612 | 14.60 | Present | 62 | 0.284761 | 2.142633 | 58 | 1 |
| 8 | 9 | 0.000077 | 0.000000 | 1.143720 | 19.40 | Present | 49 | 0.276566 | 1.440389 | 29 | 0 |
| 9 | 10 | 0.000057 | 0.000000 | 1.192183 | 30.96 | Present | 69 | 0.256163 | 0.000000 | 53 | 1 |
Last rows
| | names | sbp | tobacco | ldl | adiposity | famhist | typea | obesity | alcohol | age | chd |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 452 | 454 | 0.000042 | 1.981938 | 1.123350 | 28.81 | Present | 61 | 0.271026 | 4.493005 | 42 | 0 |
| 453 | 455 | 0.000065 | 1.206835 | 1.218579 | 39.68 | Present | 36 | 0.251580 | 0.000000 | 51 | 1 |
| 454 | 456 | 0.000047 | 0.836512 | 1.170320 | 28.02 | Absent | 60 | 0.263303 | 2.323592 | 39 | 1 |
| 455 | 457 | 0.000061 | 1.380700 | 1.109631 | 26.48 | Absent | 48 | 0.280676 | 4.681496 | 27 | 1 |
| 456 | 458 | 0.000035 | 0.693145 | 1.151819 | 42.06 | Present | 56 | 0.246643 | 1.335202 | 57 | 0 |
| 457 | 459 | 0.000022 | 0.693145 | 1.195832 | 31.72 | Absent | 64 | 0.262040 | 0.000000 | 58 | 0 |
| 458 | 460 | 0.000030 | 1.775414 | 1.159962 | 32.10 | Absent | 52 | 0.261453 | 3.227917 | 52 | 1 |
| 459 | 461 | 0.000086 | 1.551846 | 1.047465 | 15.23 | Absent | 40 | 0.301167 | 3.717181 | 55 | 0 |
| 460 | 462 | 0.000072 | 1.963168 | 1.277860 | 30.79 | Absent | 64 | 0.266206 | 3.563422 | 40 | 0 |
| 461 | 463 | 0.000057 | 0.000000 | 1.170320 | 33.41 | Present | 62 | 0.341250 | 0.000000 | 46 | 1 |